160 research outputs found

    On the Finite Time Convergence of Cyclic Coordinate Descent Methods

    Cyclic coordinate descent is a classic optimization method that has witnessed a resurgence of interest in machine learning. Reasons for this include its simplicity, speed and stability, as well as its competitive performance on $\ell_1$ regularized smooth optimization problems. Surprisingly, very little is known about its finite time convergence behavior on these problems. Most existing results either just prove convergence or provide asymptotic rates. We fill this gap in the literature by proving $O(1/k)$ convergence rates (where $k$ is the iteration counter) for two variants of cyclic coordinate descent under an isotonicity assumption. Our analysis proceeds by comparing the objective values attained by the two variants with each other, as well as with the gradient descent algorithm. We show that the iterates generated by the cyclic coordinate descent methods remain better than those of gradient descent uniformly over time. Comment: 20 pages
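
For orientation, cyclic coordinate descent on an $\ell_1$ regularized least-squares (lasso) objective can be sketched as below. This is a generic textbook version with exact per-coordinate minimization via soft-thresholding, not necessarily either of the two variants analyzed in the paper. The function names, the objective $\frac{1}{2}\|Ax - b\|^2 + \lambda \|x\|_1$, and the fixed number of sweeps are all assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal map of t * |.|)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_objective(A, b, x, lam):
    """Value of 0.5 * ||Ax - b||^2 + lam * ||x||_1 (assumed objective)."""
    residual = [sum(a * v for a, v in zip(row, x)) - b_i
                for row, b_i in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def cyclic_coordinate_descent(A, b, lam, sweeps=50):
    """Minimize the lasso objective one coordinate at a time, in fixed order.

    Returns the final iterate and the objective value after each sweep.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    residual = [-b_i for b_i in b]  # holds Ax - b; x starts at zero
    col_sq = [sum(A[i][j] ** 2 for i in range(n_rows)) for j in range(n_cols)]
    history = [lasso_objective(A, b, x, lam)]
    for _ in range(sweeps):
        for j in range(n_cols):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot reduce the loss
            # Correlation of column j with the residual that ignores x[j].
            rho = sum(A[i][j] * (A[i][j] * x[j] - residual[i])
                      for i in range(n_rows))
            new_xj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(n_rows):
                    residual[i] += A[i][j] * delta
                x[j] = new_xj
        history.append(lasso_objective(A, b, x, lam))
    return x, history
```

Because each coordinate update minimizes the objective exactly along that coordinate, the recorded objective values never increase from one sweep to the next. The sketch only tracks those values; it makes no claim about the $O(1/k)$ rate itself.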

    Fighting Bandits with a New Kind of Smoothness

    We define a novel family of algorithms for the adversarial multi-armed bandit problem, and provide a simple analysis technique based on convex smoothing. We prove two main results. First, we show that regularization via the Tsallis entropy, which includes EXP3 as a special case, achieves the $\Theta(\sqrt{TN})$ minimax regret. Second, we show that a wide class of perturbation methods achieve a near-optimal regret as low as $O(\sqrt{TN \log N})$ if the perturbation distribution has a bounded hazard rate. For example, the Gumbel, Weibull, Fréchet, Pareto, and Gamma distributions all satisfy this key property. Comment: In Proceedings of NIPS, 201
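
As a point of reference for the abstract above, here is a minimal sketch of EXP3, the algorithm the abstract names as a special case of Tsallis-entropy regularization. It is the standard loss-based form with importance-weighted estimates, not the paper's new family of algorithms. The function name `exp3`, the `loss_fn(t, arm)` callback, and the learning rate $\eta = \sqrt{2 \ln N / (TN)}$ are assumptions chosen for illustration.

```python
import math
import random


def exp3(loss_fn, n_arms, horizon, seed=0):
    """Run loss-based EXP3 for `horizon` rounds.

    `loss_fn(t, arm)` is a hypothetical callback returning a loss in [0, 1].
    Returns the sampling distribution used in the last round and the
    total loss incurred by the learner.
    """
    rng = random.Random(seed)
    # Assumed learning rate; a common tuning for an O(sqrt(T N log N)) bound.
    eta = math.sqrt(2.0 * math.log(n_arms) / (horizon * n_arms))
    est = [0.0] * n_arms  # importance-weighted cumulative loss estimates
    probs = [1.0 / n_arms] * n_arms
    total_loss = 0.0
    for t in range(horizon):
        base = min(est)  # shift estimates for numerical stability
        weights = [math.exp(-eta * (e - base)) for e in est]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        # Only the played arm is observed; dividing by its probability
        # keeps the estimate unbiased.
        est[arm] += loss / probs[arm]
    return probs, total_loss
```

A toy run, assuming a fixed environment in which arm 0 always has loss 0 and every other arm has loss 1: `exp3(lambda t, arm: 0.0 if arm == 0 else 1.0, n_arms=3, horizon=2000)`. The returned distribution should concentrate on arm 0.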